Fast Training Convergence Using Mixtures of Hidden Markov Models

Author

  • Graham Grindlay
Abstract

In this paper, we describe a method for mixing an arbitrary number of discrete fixed-structure Hidden Markov Models such that we retain their collective training experience. We show that, when presented with a novel data sequence, this new mixture model converges in significantly fewer iterations than either a randomly initialized model or any one of the mixture's component models. We also explore the notion of long-term memory to further exploit the component models' collective training experience. Several general approaches to integrating memory into the parameter updates are considered, including decaying the expectations derived in the E step of online-EM, and windowing techniques in which we use a variable-sized data sample. Our experiments show that using mixtures of HMMs as initial parameter settings can speed convergence significantly; in some cases, the mixture models needed as few as 30% of the iterations required for a randomly initialized model. Additionally, we find that mixture models adapt significantly better to data produced by non-stationary distributions.
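The abstract does not spell out the mixing scheme, but a minimal sketch of the two ideas it names might look like the following Python. Both functions are illustrative assumptions rather than the paper's actual algorithm: mix_hmms pools the parameters of discrete fixed-structure HMMs by weighted averaging (one simple way to fold their collective training experience into an initialization), and decayed_update decays accumulated E-step expectations as a crude form of long-term memory.

    import numpy as np

    def mix_hmms(components, weights=None):
        # Each component is a (pi, A, B) triple: initial-state probabilities,
        # transition matrix, and emission matrix, all with identical shapes
        # (fixed structure). A weighted average of row-stochastic matrices
        # is itself row-stochastic, so the result is a valid HMM.
        if weights is None:
            weights = np.full(len(components), 1.0 / len(components))
        pi = sum(w * p for w, (p, _, _) in zip(weights, components))
        A = sum(w * a for w, (_, a, _) in zip(weights, components))
        B = sum(w * b for w, (_, _, b) in zip(weights, components))
        return pi, A, B

    def decayed_update(old_stats, new_stats, decay=0.9):
        # Blend previously accumulated expected counts (keyed the same way
        # as the fresh E-step expectations) with the new ones. The decay
        # factor keeps a fading trace of past data, letting the model track
        # non-stationary distributions without discarding its history.
        return {key: decay * old_stats[key] + (1.0 - decay) * new_stats[key]
                for key in new_stats}

The mixed (pi, A, B) would then seed ordinary Baum-Welch training on the novel sequence, which is where the reported reduction in iterations would be measured; the uniform default weights and the decay constant are placeholders to be tuned, not values taken from the paper.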


Related articles

Introducing Busy Customer Portfolio Using Hidden Markov Model

Despite the effective role of Markov models in customer relationship management (CRM), there is no comprehensive literature review that brings together all related work. This paper searches academic databases for articles published in 2011 and earlier. One hundred articles were identified and reviewed for direct relevance to applying Markov models...

Full text

On the Convergence Rate of Random Permutation Sampler and ECR Algorithm in Missing Data Models

Label switching is a well-known phenomenon that occurs in MCMC outputs targeting the posterior distribution of the parameters in many latent variable models. Although its appearance is necessary for the convergence of the simulated Markov chain, it turns out to be a problem in the estimation procedure. In a recent paper, Papastamoulis and Iliopoulos (2010) introduced the Equivalence Classes Represent...

Full text

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems comprise two phases: training and testing. Mos...

Full text

Intrusion Detection Using Evolutionary Hidden Markov Model

Intrusion detection systems are responsible for detecting any unauthorized use, exploitation, or destruction of a system, and can help prevent cyber-attacks through network packet analysis. One of the major challenges in using these tools is the lack of training patterns of attacks available to the analysis engine, which prevents complete training...

Full text

Maximum a Posteriori Parameter Estimation of Hidden Markov Models

An iterative stochastic algorithm to perform maximum a posteriori parameter estimation of hidden Markov models is proposed. It makes the most of the statistical model by introducing an artificial probability model based on an increasing number of replicates of the unobserved Markov chain at each iteration. Under minor regularity assumptions, we provide sufficient conditions to ensure global convergence of this ...

Full text



Publication date: 2003